Finding the weight distributions of block codes is a problem of theoretical and practical
interest. Yet the weight distributions of most block codes are still unknown except for
a few classes of block codes. In this article, by using the inclusion and exclusion principle,
an explicit formula is derived which enumerates the complete weight distribution of an
(n,k,d) linear code using a partially known weight distribution. This expression is analogous
to the Pless power-moment identities, a system of equations relating the weight distribution
of a linear code to the weight distribution of its dual code.

Also, an approximate formula for the weight distribution of most linear (n,k,d) codes
is derived. It is shown that for a given linear (n,k,d) code over GF(q), the ratio of the
number of codewords of weight u to the number of words of weight u approaches the
constant $Q = q^{-(n-k)}$ as u becomes large. A relationship between the randomness of a
linear block code and the minimum distance of its dual code is given, and it is shown that
most linear block codes with rigid algebraic and combinatorial structure also display certain
random properties which make them similar to random codes with no structure at all.


I. Introduction 

Finding the weight distribution of block codes is a problem
of theoretical and practical interest. When an incomplete
decoding algorithm is used (e.g., bounded distance decoding),
the probabilities of correct decoding, decoding error, and
decoding failure can all be expressed in terms of the code's
weight enumerator [2].

Let $C$ be a linear (n,k,d) code over GF(q), and $C^\perp$ be its
$(n, n-k, d^\perp)$ dual code. Let $G$ be the generator matrix of $C$.
Let the number of codewords of weight $u$ be denoted by $A_u$.
MacWilliams [3] showed that the weight enumerator of the
dual $C^\perp$ of a linear code $C$ is given by a linear transformation
of the weight enumerator of $C$. Pless [1] introduced the
power-moment identities, a system of equations relating the
weight distribution of a linear code to the weight distribution
of its dual code. In this article, by using the inclusion and
exclusion principle, it is shown in Section III that the complete
set of $A_u$'s, $0 \le u \le n$, can be generated if only the
partial set of $A_u$'s, $d \le u \le n - d^\perp$, is known.

By modifying the techniques used in the above derivation,
an approximate formula for $A_u$ of most (n,k,d) nonbinary
linear codes is derived. This formula, together with the approximate
formula for $A_u$ of binary linear codes derived by Kasami
et al. [4], shows that the distribution $q^{-(n-k)} \binom{n}{u} (q-1)^u$ is a
close approximation to $A_u$ for most (n,k,d) codes over GF(q).
The intrinsic randomness of a linear (n,k,d) block code over
GF(q) is implicit in the Pless identities, which show that the
$v$th binomial moment, for $v = 0, 1, \ldots, d^\perp - 1$, is independent
of the code and is equal to that of the whole vector space, i.e.,
the (n,n,1) code $(GF(q))^n$. In this article, an explicit relationship
between the randomness of a linear block code and the
minimum distance of its dual code is given, and it is shown
that for large $u$,

$$\frac{\text{no. of codewords of weight } u}{\text{no. of vectors of weight } u} \approx \frac{\text{total no. of codewords}}{\text{total no. of vectors}} = q^{-(n-k)} \stackrel{\text{def}}{=} Q \tag{1}$$

Equation (1) states that if the vector space $(GF(q))^n$ is partitioned
into weight classes according to the Hamming weights
of the vectors, then the ratio of the number of codewords in
a weight class to the number of vectors in that weight class
approaches a constant $Q$, where $Q$ is the ratio of the size of the
code to the size of the whole vector space $(GF(q))^n$. This
remarkable relationship shows that most linear block codes
with rigid algebraic and combinatorial structure also display
certain intrinsic random properties which make them similar
to random codes with no structure at all.

II. Mathematical Preliminaries 

In this section, the combinatorial and coding techniques required
to derive the results in later sections are introduced.

A. Principle of Inclusion and Exclusion [5] 

Let $X$ be a set of $N$ objects, and $P(1), P(2), \ldots, P(u)$ be a
set of $u$ properties. Let $N(i_1, i_2, \ldots, i_r)$ be the number of
objects with properties $P(i_1), P(i_2), \ldots, P(i_r)$. The number of
objects $N(0)$ with none of the properties is given by

$$N(0) = N - \sum_{i_1} N(i_1) + \sum_{i_1 < i_2} N(i_1, i_2) + \cdots + (-1)^r \sum_{i_1 < i_2 < \cdots < i_r} N(i_1, i_2, \ldots, i_r) + \cdots + (-1)^u N(1, 2, 3, \ldots, u) \tag{2}$$

There are $u + 1$ terms in the RHS of Eq. (2), with the 0th term
representing the total number of objects in $X$. If only terms 0
through $r$ of the RHS of Eq. (2) are kept, the truncated sum is an
upper bound on $N(0)$ when $r$ is even and a lower bound on $N(0)$
when $r$ is odd. Hence $N(0)$ always lies between two consecutive
truncated sums, and the maximum error magnitude
introduced by truncating the inclusion and exclusion formula
at the $r$th term does not exceed the magnitude of
the $r$th term. This fact will be used later to upper bound the
magnitude of the errors of the approximate weight distribution
formula.
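The truncation property can be checked numerically. The sketch below is an illustration only and is not taken from the article: it counts the integers in 1..100 divisible by none of the primes 2, 3, 5, 7 (a hypothetical choice of objects and properties), and records every truncated sum of Eq. (2).

```python
from itertools import combinations
from math import prod

def inclusion_exclusion_terms(N, divisors):
    """Return [T_0, ..., T_u]; T_r sums, over all r-subsets of `divisors`, the
    count of integers in 1..N divisible by every member of the subset.
    Assumes the divisors are pairwise coprime."""
    return [
        sum(N // prod(subset) for subset in combinations(divisors, r))
        for r in range(len(divisors) + 1)
    ]

N, primes = 100, [2, 3, 5, 7]
terms = inclusion_exclusion_terms(N, primes)

# Brute-force count of objects with none of the properties.
exact = sum(all(x % p for p in primes) for x in range(1, N + 1))

# Truncated sums of Eq. (2): keep terms 0 through r.
partial_sums = []
running = 0
for r, t in enumerate(terms):
    running += (-1) ** r * t
    partial_sums.append(running)

print(terms, partial_sums, exact)
```

Keeping an even-indexed last term overshoots the exact count, keeping an odd-indexed last term undershoots it, and in both cases the gap is no larger than the last term kept.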

B. Facts on Coding Theory 

A linear (n,k,d) code over GF(q) can be generated by a
$k \times n$ generator matrix $G$, not necessarily unique and such that
rank($G$) = $k$. Let $l$ be the maximum number such that no set of $l$ or
fewer columns of $G$ is linearly dependent. Then

$$l \le k \tag{3}$$

Equality in Eq. (3) is achieved in the case of maximum distance
separable (MDS) codes. Since $G$ is the parity-check
matrix of $C^\perp$, $l = d^\perp - 1$. Let $\mathrm{col}_{i_1}, \mathrm{col}_{i_2}, \ldots, \mathrm{col}_{i_j}$ be any $j$
particular columns of $G$, $j \le l \le k$. It is obvious that there
exists a $k \times n$ generator matrix $G'$ of $C$ and a $k \times k$ nonsingular
matrix $K$ such that

$$G' = KG \tag{4}$$

and $\mathrm{col}_{i_1}, \mathrm{col}_{i_2}, \ldots, \mathrm{col}_{i_j}$ of $G'$ form a $k \times j$ submatrix of the
form $\begin{pmatrix} I_j \\ 0 \end{pmatrix}$, where $I_j$ is the $j \times j$ identity matrix. This fact
guarantees that the number of codewords with zeros on the $i_1$th, $i_2$th, $\ldots$, $i_j$th
coordinates equals $q^{k-j}$ for $j \le l$.
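As a sanity check of this counting fact, the sketch below enumerates the (7,4,3) binary Hamming code, for which $d^\perp = 4$ and hence $l = 3$. The systematic generator matrix `G` is one standard choice assumed for illustration; the article does not specify a matrix.

```python
from itertools import combinations, product

# One systematic generator matrix of the (7,4,3) binary Hamming code (assumed).
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]
q, k, n = 2, 4, 7
l = 3  # l = d_dual - 1, with d_dual = 4 for this code

def codewords(G, q):
    """Enumerate all q^k codewords (arithmetic mod q, so q must be prime)."""
    for msg in product(range(q), repeat=len(G)):
        yield tuple(sum(m * g for m, g in zip(msg, col)) % q for col in zip(*G))

code = list(codewords(G, q))

def count_zero_on(coords):
    """Number of codewords that are zero on every coordinate in `coords`."""
    return sum(all(c[i] == 0 for i in coords) for c in code)

# For each j <= l, collect the distinct counts over all j-subsets of coordinates.
counts = {
    j: {count_zero_on(s) for s in combinations(range(n), j)}
    for j in range(l + 1)
}
counts_beyond_l = {count_zero_on(s) for s in combinations(range(n), l + 1)}
print(counts, counts_beyond_l)
```

For every $j \le l$ the count is the single value $q^{k-j}$ regardless of which coordinates are chosen; for $j = l + 1$ the count depends on the chosen coordinates, which is why Section III treats larger $j$ separately.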

III. Derivation of Formula 

Let $c$ be a codeword of $C$ with Hamming weight $u$, $u \ge n - l$.
Let the coordinates of $c$ be indexed by $\{0, 1, \ldots, n-1\}$. Then
$c$ has $v$ zeros ($v \le l$), where $v = n - u$. Let $V$ be a set of $v$
coordinates, $|V| = v$. Let $\{i_1, i_2, \ldots, i_j\} \subseteq \{0, 1, \ldots, n-1\} - V$
be a set of $j$ coordinates. Define $S(i_1, i_2, \ldots, i_j) = \{c : c \in C$
and $c$ has zeros in $V \cup \{i_1, i_2, \ldots, i_j\}\}$. A codeword $c \in S(i_1, i_2, \ldots, i_j)$
always has at least $v + j$ zeros. Let

$$T_j = \sum_{|V| = v} \; \sum_{i_1 < i_2 < \cdots < i_j} |S(i_1, i_2, \ldots, i_j)|$$

That is, $T_j$ is the $j$th term in the inclusion and exclusion
formula. From the discussion in Section II.B, the number of
codewords in $S(i_1, i_2, \ldots, i_j)$ is

$$|S(i_1, i_2, \ldots, i_j)| = q^{k-v-j} \quad \text{for } 0 \le j \le l - v \tag{5}$$

There are $\binom{n}{v}$ ways to choose $V$ from $\{0, 1, \ldots, n-1\}$, and for
each choice of $V$ there are $\binom{u}{j}$ ($u = n - v$) ways to choose
$i_1, i_2, \ldots, i_j$ from the remaining set of $u = n - v$ coordinates. Thus

$$T_j = \binom{n}{v} \binom{u}{j} q^{k-v-j} \quad \text{for } 0 \le j \le l - v \tag{6}$$

For $l - v + 1 \le j \le n - d - v$, the number of zeros in the codewords
of $S(i_1, i_2, \ldots, i_j)$ exceeds $l$ and therefore $T_j$ cannot be
expressed using Eq. (5). In this case $T_j$ is evaluated by counting
the number of sets $S(i_1, i_2, \ldots, i_j)$ each codeword can contribute
to. For a given $v$ and $j$, the codewords that can contribute
to $T_j$ are the zero codeword and the codewords of weight
$n - m$, $v + j \le m \le n - d$. For the zero codeword, there are $\binom{n}{v}$
ways of choosing $V$ and $\binom{u}{j}$ ways of choosing the remaining $j$
zero coordinates. For a codeword of weight $n - m$ ($m$ zeros),
there are $\binom{m}{v}$ ways to choose $V$ and $\binom{m-v}{j}$
ways to choose the $j$ remaining zero coordinates. There are
$A_{n-m}$ codewords of weight $n - m$. Thus

$$T_j = \binom{n}{v} \binom{u}{j} + \sum_{m=v+j}^{n-d} \binom{m}{v} \binom{m-v}{j} A_{n-m} \quad \text{for } l - v + 1 \le j \le n - d - v \tag{7}$$

For $n - d - v + 1 \le j \le n - v$, the number of zeros in the codewords
of $S(i_1, i_2, \ldots, i_j)$ is at least $n - d + 1$. Since the code
has minimum distance $d$, $S(i_1, i_2, \ldots, i_j) = \{0\}$. Thus,

$$|S(i_1, i_2, \ldots, i_j)| = 1 \quad \text{for } n - d - v + 1 \le j \le n - v \tag{8}$$

As in the case for $0 \le j \le l - v$, there are $\binom{n}{v}$ ways to choose $V$
and for each $V$ there are $\binom{u}{j}$ ways to choose $i_1, i_2, \ldots, i_j$. Thus

$$T_j = \binom{n}{v} \binom{u}{j} \quad \text{for } n - d - v + 1 \le j \le n - v \tag{9}$$

By the principle of inclusion and exclusion, the number of
codewords of weight $u$ ($v$ zeros), which is denoted by $A_u$, is
given as follows:

$$A_u = \sum_{j=0}^{u} (-1)^j T_j \tag{10}$$

Although the above derivation is based upon the assumption
that $u \ge n - l$, it is not hard to show that Eq. (10) is indeed
true for all $u$, $0 \le u \le n$. For $d \le u \le n - l - 1$ (that is, $u \le n - d^\perp$),
Eq. (10) is reduced to the identity $A_u = A_u$ (proof omitted).

It is observed from the above that only the derivation of
$T_j$'s in the range $l - v + 1 \le j \le n - d - v$ ($l + 1 \le v + j \le n - d$,
where $v + j$ is the number of zeros in a codeword) requires
prior knowledge of the $A_{n-m}$'s (weight enumerator of codewords
with $m$ zeros), where $v + j \le m \le n - d$. Thus the complete set
of $A_u$'s, $0 \le u \le n$, can be generated if only the partial set of
$A_u$'s, $d \le u \le n - d^\perp$, is known. An example which generates
the weight distribution of the (7,4) Hamming code is given in
Appendix A.

The above results are summarized in the following theorem
and corollary.

Theorem 1. If $C$ is an (n,k,d) code over GF(q), then

$$A_u = \sum_{j=0}^{u} (-1)^j T_j \quad \text{for } n - d^\perp + 1 \le u \le n$$

where

$$T_j = \begin{cases} \binom{n}{v} \binom{u}{j} q^{k-v-j} & \text{for } 0 \le j \le l - v \\ \binom{n}{v} \binom{u}{j} + \sum_{m=v+j}^{n-d} \binom{m}{v} \binom{m-v}{j} A_{n-m} & \text{for } l - v + 1 \le j \le n - d - v \\ \binom{n}{v} \binom{u}{j} & \text{for } n - d - v + 1 \le j \le n - v \end{cases}$$

Corollary 1. If $A_u$, $d \le u \le n - d^\perp$, of an (n,k,d) linear
code $C$ over GF(q) are given, the remaining $A_u$'s, $n - d^\perp + 1 \le u \le n$,
can be evaluated explicitly using the equations
given in Theorem 1.


IV. Approximate Formula 

Theorem 1 and Corollary 1 in Section III enable one to
enumerate the complete weight distribution $A_u$, $0 \le u \le n$,
given that the partial set of $A_u$, $d \le u \le n - d^\perp$, is known. This
partial set of $A_u$ is required in the calculation of $T_j$, $l - v + 1 \le j \le n - d - v$.
In cases in which knowledge of this partial set
is not available, one can still derive an approximate formula
for $A_u$ as follows. For a given coordinate set $V$, $|V| = v$, let $A'_V$
denote the number of codewords whose zero coordinates are exactly the
$v$ coordinates in $V$. Using a similar derivation as in Section III, $A'_V$ can be
represented by the inclusion and exclusion principle as follows:

$$\begin{aligned} A'_V &= |S(0)| + (-1) \sum_{i_1} |S(i_1)| + \cdots + (-1)^j \sum_{i_1 < i_2 < \cdots < i_j} |S(i_1, i_2, \ldots, i_j)| + \cdots + (-1)^{n-v} |S(i_1, i_2, \ldots, i_{n-v})| \\ &= \sum_{j=0}^{l-v} (-1)^j \binom{u}{j} q^{k-v-j} + \sum_{j=l-v+1}^{n-d-v} (-1)^j \sum_{i_1 < i_2 < \cdots < i_j} |S(i_1, i_2, \ldots, i_j)| + \sum_{j=n-d-v+1}^{n-v} (-1)^j \binom{u}{j} \end{aligned} \tag{11}$$

where $S(0)$ denotes the $j = 0$ set of codewords with zeros in $V$.
If the above inclusion and exclusion formula is truncated at
the $(l - v)$th term, Eq. (11) is reduced to

$$A'_V = \sum_{j=0}^{l-v} (-1)^j \binom{u}{j} q^{k-v-j} + E_1 \tag{12}$$

where

$$E_1 = \sum_{j=l-v+1}^{n-d-v} \; \sum_{i_1 < i_2 < \cdots < i_j} (-1)^j |S(i_1, i_2, \ldots, i_j)| + \sum_{j=n-d-v+1}^{n-v} (-1)^j \binom{u}{j}$$

From the discussion in Section II.A, $|E_1| \le \binom{u}{l-v} q^{k-l}$. If the sum

$$\sum_{j=l-v+1}^{u} (-1)^j \binom{u}{j} q^{k-v-j}$$

is added to and subtracted from Eq. (12), and $E_2$ denotes its negative,

$$E_2 = \sum_{j=l-v+1}^{u} (-1)^{j+1} \binom{u}{j} q^{k-v-j}$$

one has

$$A'_V = \sum_{j=0}^{u} (-1)^j \binom{u}{j} q^{k-v-j} + E_1 + E_2 = \frac{(q-1)^u}{q^{n-k}} + E_1 + E_2 \tag{13}$$

If $\binom{u}{l-v} q^{k-l} \ge \binom{u}{l-v+1} q^{k-l-1}$, that is, $u \ge [(q+1)/q](n-l) - 1$, $E_2$ is a
sum of terms with alternate signs and decreasing magnitude.
Then $|E_2| \le \binom{u}{l-v} q^{k-l}$. Thus

$$A'_V = \frac{(q-1)^u}{q^{n-k}} + E \tag{14}$$

where $E = E_1 + E_2$ and $|E| \le 2 \binom{u}{l-v} q^{k-l}$. $A'_V$ can thus be
approximated by $(q-1)^u / q^{n-k}$, and the goodness of approximation
depends on how small the ratio $R = E / [(q-1)^u q^{-(n-k)}]$ is.
By using the upper bound on $|E|$, an upper
bound on this ratio is given by

$$\frac{2 \binom{u}{l-v} q^{k-l}}{(q-1)^u}$$

Since $v \le l$, there are $\binom{n}{v} = \binom{n}{u}$ ways to choose $v$ zeros from
$\{0, 1, \ldots, n-1\}$. Then $A_u$ can be approximated by the following
expression:

$$A_u = \sum_{|V| = n-u} A'_V \approx q^{-(n-k)} \binom{n}{u} (q-1)^u \tag{15}$$

for $u \ge \max\{n - l, [(q+1)/q](n-l) - 1\}$.
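The closed form of the full alternating sum in Eq. (13) follows from the binomial theorem. The step is written out below as a reading aid, using only the symbols already defined ($u = n - v$); the numerical instance takes the (31,15,17) Reed-Solomon code over GF(32) of Fig. 3.

```latex
\begin{align*}
\sum_{j=0}^{u} (-1)^j \binom{u}{j} q^{k-v-j}
  &= q^{k-v} \sum_{j=0}^{u} \binom{u}{j} \left(-\frac{1}{q}\right)^{j}
   = q^{k-v} \left(1 - \frac{1}{q}\right)^{u} \\
  &= q^{k-v-u} (q-1)^{u}
   = \frac{(q-1)^{u}}{q^{\,n-k}} , \qquad \text{since } u + v = n .
\end{align*}
% Summing this leading term over the \binom{n}{u} choices of V gives Eq. (15).
% Instance: n = 31, k = 15, q = 32, u = 31:
%   32^{-16} \binom{31}{31} 31^{31} \approx 1.412 \times 10^{22},
% which agrees with the last row of Fig. 3.
```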

Strictly speaking, the derivation of Eq. (15) is only valid
for $u \ge \max\{n - l, [(q+1)/q](n-l) - 1\}$. However, it is
observed that in most cases, $q^{-(n-k)} \binom{n}{u} (q-1)^u$ is also a close
approximation to $A_u$ for $u$ considerably smaller than $n - l$ (as
in the case of Reed-Solomon codes). The upper bound on $R$
derived above has a denominator term $(q-1)^u$, and this indicates
that this approximation formula is good for nonbinary
linear codes, and is not useful for binary linear codes. The
looseness of this approximation for binary linear codes is
best illustrated by extended binary codes, which only have
even weights. However, it is observed that for most extended
binary codes, the number of codewords of weight $u$, where $u$
is even and is not close to 0 or $n$, can be approximated by the
sum of two adjacent binomial coefficients,
$2^{-(n-k-1)} \left[ \binom{n-1}{u} + \binom{n-1}{u-1} \right]$. This is obvious since an (n,k,d) extended
binary code can always be constructed from an (n-1,k,d-1)
binary code by appending a parity bit to each codeword.
The weight distribution and its approximation for the
(128,113,6) binary extended BCH code are given in Fig. 1. In the
case of binary primitive codes, Kasami et al. [4] generalized
Sidel'nikov's approach [6] and showed that the weights of
most binary primitive codes have an approximately binomial
distribution. For nonbinary linear codes, the upper bound on
$R$ shows that the approximation in Eq. (15) is particularly
good for codes with large alphabet sets. The upper bound on
$R$ for the (31,15,17) Reed-Solomon code over GF(32) is given
in Fig. 2. The weight distribution and its approximation (using
Eq. 15) of the (31,15,17) Reed-Solomon code are given in
Fig. 3.
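The two-binomial approximation for extended binary codes is easy to evaluate. The sketch below is a minimal illustration (the function name `extended_binary_approx` is hypothetical); by Pascal's rule the bracketed sum equals $\binom{n}{u}$, so the expression is the same as $2^{-(n-k-1)} \binom{n}{u}$ for even $u$.

```python
from math import comb

def extended_binary_approx(n, k, u):
    """Approximate A_u for an (n, k) extended binary code:
    2^-(n-k-1) * [C(n-1, u) + C(n-1, u-1)] for even u, and 0 for odd u."""
    if u % 2:
        return 0.0
    lower = comb(n - 1, u - 1) if u >= 1 else 0
    return (comb(n - 1, u) + lower) / 2 ** (n - k - 1)

# Parameters of the (128,113,6) extended BCH code tabulated in Fig. 1.
n, k = 128, 113
for u in (2, 4, 6, 8, 64):
    print(u, f"{extended_binary_approx(n, k, u):.3e}")
```

For $u = 6$ this gives about 3.310e+005 against the exact 3.414e+005 listed in Fig. 1, and the two columns agree to four digits from $u = 12$ onward.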

V. Randomness of a Linear Block Code 

In this section, the approximation for the weight distribution
of linear codes will be used to investigate the randomness
of linear block codes. It was shown in [7] that in the case of
MDS codes, where both the weight distribution of the codes
and the weight enumerators of decodable words are known,
the following relationships are obtained:

$$\frac{\text{no. of MDS codewords of weight } u}{\text{no. of vectors of weight } u} \approx \frac{\text{total no. of MDS codewords}}{\text{total no. of vectors}} = q^{-(n-k)} \tag{16}$$

and

$$\frac{\text{no. of decodable words of weight } u}{\text{no. of vectors of weight } u} \approx \frac{\text{total no. of decodable words}}{\text{total no. of vectors}} = q^{-(n-k)} V_n(t) \tag{17}$$

where $V_n(t)$ is the volume of the Hamming sphere of the
codes. In this article, by using the approximation in Eq. (15),
Eq. (16) is generalized to all linear block codes. That is, for an
(n,k,d) linear code $C$,

$$\frac{\text{no. of codewords of weight } u}{\text{no. of vectors of weight } u} \approx q^{-(n-k)} = \frac{\text{total no. of codewords}}{\text{total no. of vectors}} \tag{18}$$

for $u \ge \max\{n - l, [(q+1)/q](n-l) - 1\}$. As was discussed
in Section IV, in the case of nonbinary block codes, the
goodness of the approximation in Eq. (14) depends upon the
ratio $R = E / [(q-1)^u q^{-(n-k)}]$, which is upper bounded by
$\left[ 2 \binom{u}{l-v} q^{k-l} \right] / (q-1)^u$. A larger weight $u$ and/or a larger $d^\perp$ of
$C$ correspond to a better approximation of the weight distribution
of $C$ by the formula $q^{-(n-k)} \binom{n}{u} (q-1)^u$. This in turn implies that
if $d^\perp$ of $C$ is large, the ratio of the number of codewords of
weight $u$ to the number of words of weight $u$ approaches
$q^{-(n-k)}$ more quickly as $u$ gets large. This result is, in some
way, analogous to the Pless power-moment identities [1], which
state that for a linear (n,k,d) block code, there are $d^\perp$
binomial moments ($0, 1, \ldots, d^\perp - 1$) that are independent of the
code and are equal to the binomial moments of the whole
vector space.


VI. Conclusion 

In this article, by using the inclusion and exclusion principle,
an explicit formula which enumerates the complete weight
distribution of an (n,k,d) linear code using a partially known
weight distribution is derived. Using similar combinatorial and
coding techniques, an approximate formula for the weight
distribution of most linear (n,k,d) codes is derived. A relationship
between the randomness of a linear block code and the minimum
distance of its dual code is given, and it is shown that
most linear block codes with rigid algebraic and combinatorial
structure also display certain random properties which make
them similar to random codes with no structure at all. The
results presented can help to simplify the calculations of the
probabilities of correct decoding, decoding error, and decoding
failure, which are all expressed in terms of the code's
weight enumerator.
  u    A_u (Exact)    A'_u (Approx.)

  0    1.000e+000     1.000e+000
  2    0.000e+000     4.961e-001
  4    0.000e+000     6.511e+002
  6    3.414e+005     3.310e+005
  8    8.729e+007     8.726e+007
 10    1.384e+010     1.385e+010
 12    1.448e+012     1.448e+012
 14    1.061e+014     1.061e+014
 16    5.697e+015     5.697e+015
 18    2.315e+017     2.315e+017
 20    7.303e+018     7.303e+018
 22    1.827e+020     1.827e+020
 24    3.683e+021     3.683e+021
 26    6.070e+022     6.070e+022
 28    8.272e+023     8.272e+023
 30    9.413e+024     9.413e+024
 32    9.020e+025     9.020e+025
 34    7.332e+026     7.332e+026
 36    5.087e+027     5.087e+027
 38    3.029e+028     3.029e+028
 40    1.555e+029     1.555e+029
 42    6.914e+029     6.915e+029
 44    2.672e+030     2.672e+030
 46    8.998e+030     8.998e+030
 48    2.649e+031     2.649e+031
 50    6.834e+031     6.834e+031
 52    1.548e+032     1.548e+032
 54    3.082e+032     3.082e+032
 56    5.406e+032     5.406e+032
 58    8.359e+032     8.359e+032
 60    1.141e+033     1.141e+033
 62    1.374e+033     1.374e+033
 64    1.462e+033     1.462e+033

 *  A_u = 0 for odd u.
 ** A_u = A_{128-u} and A'_u = A'_{128-u}.

Fig. 1. Weight distribution and its approximation for the (128,113,6)
BCH code.


  u    R

 16    2.74943144e-024
 17    1.50775270e-024
 18    4.37734673e-025
 19    8.94296586e-026
 20    1.44241389e-026
 21    1.95423782e-027
 22    2.31146419e-028
 23    2.44993596e-029
 24    2.37090920e-030
 25    2.12446863e-031
 26    1.78181347e-032
 27    1.41082139e-033
 28    1.06190837e-034
 29    7.64152968e-036
 30    5.28215447e-037
 31    3.52143591e-038

Fig. 2. Upper bound on R for the
(31,15,17) RS code.
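The tabulated values can be regenerated from the expression quoted in Section V, $2 \binom{u}{l-v} q^{k-l} / (q-1)^u$ with $v = n - u$. The sketch below assumes $d^\perp = k + 1 = 16$ for this code (the dual of an MDS code is MDS), so $l = 15$; the function name `r_upper_bound` is hypothetical.

```python
from math import comb

def r_upper_bound(n, k, d_dual, q, u):
    """Evaluate 2 * C(u, l - v) * q^(k - l) / (q - 1)^u, with l = d_dual - 1
    and v = n - u. Only meaningful for u >= n - l (so that l - v >= 0)."""
    l = d_dual - 1
    v = n - u
    if l - v < 0:
        raise ValueError("bound is stated only for u >= n - l")
    return 2 * comb(u, l - v) * q ** (k - l) / (q - 1) ** u

# (31,15,17) Reed-Solomon code over GF(32); d_dual = 16 is an assumption
# based on the MDS property.
n, k, d_dual, q = 31, 15, 16, 32
for u in range(16, 32):
    print(u, f"{r_upper_bound(n, k, d_dual, q, u):.8e}")
```

The first and last rows come out as roughly 2.7494e-24 and 3.5214e-38, in line with the table.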


  u    A_u (Exact)    A'_u (Approx.)

  0    1.000e+000     8.272e-025
  1    0.000e+000     7.949e-022
  2    0.000e+000     3.696e-019
  3    0.000e+000     1.108e-016
  4    0.000e+000     2.404e-014
  5    0.000e+000     4.024e-012
  6    0.000e+000     5.405e-010
  7    0.000e+000     5.984e-008
  8    0.000e+000     5.565e-006
  9    0.000e+000     4.409e-004
 10    0.000e+000     3.007e-002
 11    0.000e+000     1.780e+000
 12    0.000e+000     9.195e+001
 13    0.000e+000     4.166e+003
 14    0.000e+000     1.660e+005
 15    0.000e+000     5.833e+006
 16    0.000e+000     1.808e+008
 17    8.221e+009     4.946e+009
 18    9.591e+010     1.193e+011
 19    2.629e+012     2.530e+012
 20    4.676e+013     4.705e+013
 21    7.646e+014     7.640e+014
 22    1.076e+016     1.077e+016
 23    1.306e+017     1.306e+017
 24    1.349e+018     1.349e+018
 25    1.171e+019     1.171e+019
 26    8.380e+019     8.380e+019
 27    4.811e+020     4.811e+020
 28    2.130e+021     2.130e+021
 29    6.832e+021     6.832e+021
 30    1.412e+022     1.412e+022
 31    1.412e+022     1.412e+022

Fig. 3. Weight distribution and its approximation for the
(31,15,17) RS code over GF(32).
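Both columns of Fig. 3 can be regenerated with a few lines. The approximate column is Eq. (15). For the exact column, the sketch below uses the standard closed-form weight distribution of MDS codes, $A_u = \binom{n}{u}(q-1)\sum_{j=0}^{u-d}(-1)^j\binom{u-1}{j}q^{u-d-j}$; that formula is textbook material assumed here for illustration and is not derived in this article. The function names are hypothetical.

```python
from math import comb

def mds_weight(n, k, q, u):
    """Exact A_u of an (n, k, n-k+1) MDS code over GF(q), using the standard
    closed form (an assumption of this sketch, not taken from the article)."""
    d = n - k + 1
    if u == 0:
        return 1
    if u < d:
        return 0
    inner = sum((-1) ** j * comb(u - 1, j) * q ** (u - d - j)
                for j in range(u - d + 1))
    return comb(n, u) * (q - 1) * inner

def approx_weight(n, k, q, u):
    """Eq. (15): q^-(n-k) * C(n, u) * (q-1)^u."""
    return comb(n, u) * (q - 1) ** u / q ** (n - k)

n, k, q = 31, 15, 32
for u in range(17, n + 1):
    exact, approx = mds_weight(n, k, q, u), approx_weight(n, k, q, u)
    # exact / approx is the left side of Eq. (18) divided by q^-(n-k).
    print(u, f"{exact:.3e}", f"{approx:.3e}", f"{exact / approx:.6f}")
```

The last printed column is the ratio of Eq. (18) normalized by $q^{-(n-k)}$; it moves toward 1 as $u$ grows, which is the randomness property discussed in Section V.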
Appendix A

An Example Which Generates the Complete Weight Distribution of the 
(7,4,3) Hamming Code from an Incomplete Weight Distribution 


This example illustrates the use of Theorem 1 to evaluate the complete weight distribution of the (7,4,3) Hamming code $C$. It
is given that $C$ has minimum distance $d = 3$ and $C^\perp$ has minimum distance $d^\perp = 4$. According to Theorem 1 it is also required to
know the partial weight distribution $A_u$, $3 = d \le u \le n - d^\perp = 3$. It is given that $A_3 = 7$. $A_4$, $A_5$, $A_6$, and $A_7$ are now evaluated
as follows:

1. $u = 4$ ($v = 3$). In this case $T_0$, $T_1$, $T_2$, $T_3$, and $T_4$ are $\binom{7}{3}\binom{4}{0}2$, $\binom{7}{3}\binom{4}{1} + \binom{4}{3}7$, $\binom{7}{3}\binom{4}{2}$, $\binom{7}{3}\binom{4}{3}$, and $\binom{7}{3}\binom{4}{4}$, respectively. Thus,

$$A_4 = 70 - 168 + 210 - 140 + 35 = 7$$

2. $u = 5$ ($v = 2$). In this case $T_0$, $T_1$, $T_2$, $T_3$, $T_4$, and $T_5$ are $\binom{7}{2}\binom{5}{0}2^2$, $\binom{7}{2}\binom{5}{1}2$, $\binom{7}{2}\binom{5}{2} + \binom{4}{2}7$, $\binom{7}{2}\binom{5}{3}$, $\binom{7}{2}\binom{5}{4}$, and $\binom{7}{2}\binom{5}{5}$,
respectively. Thus,

$$A_5 = 84 - 210 + 252 - 210 + 105 - 21 = 0$$

3. $u = 6$ ($v = 1$). In this case $T_0$, $T_1$, $T_2$, $T_3$, $T_4$, $T_5$, and $T_6$ are $\binom{7}{1}\binom{6}{0}2^3$, $\binom{7}{1}\binom{6}{1}2^2$, $\binom{7}{1}\binom{6}{2}2$, $\binom{7}{1}\binom{6}{3} + \binom{4}{1}7$, $\binom{7}{1}\binom{6}{4}$, $\binom{7}{1}\binom{6}{5}$, and
$\binom{7}{1}\binom{6}{6}$, respectively. Thus,

$$A_6 = 56 - 168 + 210 - 168 + 105 - 42 + 7 = 0$$

4. $u = 7$ ($v = 0$). In this case $T_0$, $T_1$, $T_2$, $T_3$, $T_4$, $T_5$, $T_6$, and $T_7$ are $\binom{7}{0}2^4$, $\binom{7}{1}2^3$, $\binom{7}{2}2^2$, $\binom{7}{3}2$, $\binom{7}{4} + \binom{4}{0}7$, $\binom{7}{5}$, $\binom{7}{6}$, and $\binom{7}{7}$, respectively.
Thus,

$$A_7 = 16 - 56 + 84 - 70 + 42 - 21 + 7 - 1 = 1$$
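The four cases above can be reproduced mechanically. The following sketch evaluates the three-case $T_j$ of Theorem 1 and the alternating sum of Eq. (10); the function name `theorem1_weights` and the dictionary interface are hypothetical choices made for illustration.

```python
from math import comb

def theorem1_weights(n, k, d, d_dual, q, known):
    """Complete the weight distribution with Theorem 1.

    `known` maps u -> A_u for d <= u <= n - d_dual (missing entries count as 0).
    Returns a dict that also contains A_u for n - d_dual + 1 <= u <= n.
    """
    l = d_dual - 1
    A = dict(known)
    for u in range(n - d_dual + 1, n + 1):
        v = n - u  # number of zero coordinates
        total = 0
        for j in range(u + 1):
            if j <= l - v:
                # First case of Theorem 1, Eq. (6).
                t = comb(n, v) * comb(u, j) * q ** (k - v - j)
            elif j <= n - d - v:
                # Second case, Eq. (7): zero codeword plus known weights.
                t = comb(n, v) * comb(u, j) + sum(
                    comb(m, v) * comb(m - v, j) * known.get(n - m, 0)
                    for m in range(v + j, n - d + 1)
                )
            else:
                # Third case, Eq. (9): only the zero codeword contributes.
                t = comb(n, v) * comb(u, j)
            total += (-1) ** j * t
        A[u] = total
    return A

# (7,4,3) Hamming code: d = 3, d_dual = 4, and A_3 = 7 is the only known value.
hamming = theorem1_weights(n=7, k=4, d=3, d_dual=4, q=2, known={3: 7})
print(hamming)
```

Together with $A_0 = 1$, the result accounts for all $2^4 = 16$ codewords.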